Paper Template for INTERSPEECH 2005 – Eurospeech, Lisboa
Authors
Abstract
Annotated speech corpora are indispensable to various areas of speech research. In this paper, we present a novel discriminative training approach for HMM-based automatic phonetic segmentation. The objective of the proposed minimum boundary error (MBE) discriminative training approach is to minimize the expected boundary errors over a set of phonetic alignments represented as a phonetic lattice. This approach is inspired by the recently proposed minimum phone error (MPE) training algorithm for automatic speech recognition. To evaluate the MBE training approach, we conducted automatic phonetic segmentation experiments on the TIMIT acoustic-phonetic continuous speech corpus. The MBE-trained HMMs can identify 79.75% of human-labeled phone boundaries within a tolerance of 10 ms, compared to 71.23% identified by the conventional ML-trained HMMs. Moreover, by using the MBE-trained HMMs, only 7.89% of automatically labeled phone boundaries have errors larger than 20 ms.
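The evaluation figures quoted in the abstract (the share of boundaries within a 10 ms tolerance, and the share with errors larger than 20 ms) can be illustrated with a minimal sketch. It assumes that reference and automatic boundaries are already paired one-to-one and expressed in milliseconds, and that "within a tolerance" means an absolute error less than or equal to the tolerance. The function names and the toy boundary values are hypothetical and are not taken from the paper or from TIMIT.

```python
from typing import Sequence


def boundary_accuracy(reference_ms: Sequence[float],
                      predicted_ms: Sequence[float],
                      tolerance_ms: float) -> float:
    """Fraction of predicted boundaries within tolerance_ms of the reference.

    Assumption: boundaries are paired one-to-one by position, and a boundary
    counts as correct when |reference - predicted| <= tolerance_ms.
    """
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("boundary lists must not be empty")
    hits = sum(1 for ref, pred in zip(reference_ms, predicted_ms)
               if abs(ref - pred) <= tolerance_ms)
    return hits / len(reference_ms)


def fraction_exceeding(reference_ms: Sequence[float],
                       predicted_ms: Sequence[float],
                       threshold_ms: float) -> float:
    """Fraction of predicted boundaries whose error is larger than threshold_ms."""
    return 1.0 - boundary_accuracy(reference_ms, predicted_ms, threshold_ms)


# Toy data (hypothetical): four human-labeled and four automatic boundaries.
reference = [100, 250, 400, 520]
predicted = [104, 262, 395, 545]  # absolute errors: 4, 12, 5, 25 ms

within_10ms = boundary_accuracy(reference, predicted, tolerance_ms=10)
beyond_20ms = fraction_exceeding(reference, predicted, threshold_ms=20)
print(f"within 10 ms: {within_10ms:.2%}, beyond 20 ms: {beyond_20ms:.2%}")
```

Under these assumptions, a figure such as 79.75% would correspond to `boundary_accuracy(..., tolerance_ms=10)` computed over all test boundaries, and 7.89% to `fraction_exceeding(..., threshold_ms=20)`.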
Similar resources
Transcribing lectures and seminars
This paper describes recent research carried out in the context of the FP6 Integrated Project CHIL in developing a system to automatically transcribe lectures and seminars. We made use of widely available corpora to train both the acoustic and language models, since only a small amount of CHIL data was available for system development. For acoustic model training we made use of the transcribed po...
Auditory visual speech processing
This paper provides an overview of the developments in Auditory Visual Speech Processing, a Special Interest Group within Eurospeech. I hope that this discussion will be informative and useful to readers in a variety of fields, including psychology, speech science, animation, psycholinguistics, human-machine interaction, hearing-impaired communication, and numerous other fields which also share ...
Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments
This paper describes a multi-template unsupervised speaker adaptation based on HMM-Sufficient Statistics. Multiple class-dependent models based on gender and age are used to push up the adaptation performance while keeping adaptation time within a few seconds with just one arbitrary utterance. Adaptation begins with the estimation of the speaker's class from the N-best neighbor speakers using Gaussia...
Beyond Utterance Extraction: Summary Recombination for Speech Summarization
This paper describes a template filling approach for creating conversation summaries. The templates are generated from generalized summary fragments from a training corpus. Necessary pieces of information for filling them are extracted automatically from the conversation transcripts given linguistic features, and drive the fragment selection process. The approach obtains ROUGE-2 scores of 0.084...
Phylogenetic relationships of the north-eastern Atlantic and Mediterranean blenniids
1 Unidade de Investigação em Eco-etologia, Instituto Superior de Psicologia Aplicada, Rua do Jardim do Tabaco, 34, 1149-041, Lisboa, Portugal 2 Instituto de Oceanografia, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal 3 USVE, Institut National de la Recherche Agronomique, BP 2078, 06606 Antibes cedex, France 4 Centro de Ciências do Mar do Algarve (CCMAR...